SenseDefs: a multilingual corpus of semantically annotated textual definitions
Similar resources
PACE Corpus: a multilingual corpus of Polarity-annotated textual data from the domains Automotive and CEllphone
In this paper, we describe a publicly available multilingual evaluation corpus for phrase-level Sentiment Analysis that can be used to evaluate real world applications in an industrial context. This corpus contains data from English and German Internet forums (1000 posts each) focusing on the automotive domain. The major topic of the corpus is connecting and using cellphones to/in cars. The pre...
A Semantically Annotated Swedish Medical Corpus
With the information overload in the life sciences there is an increasing need for annotated corpora, particularly with biological and biomedical entities, which is the driving force for data-driven language processing applications and the empirical approach to language study. Inspired by the work in the GENIA Corpus, which is one of the very few of such corpora, extensively used in the biomedi...
Developing a large semantically annotated corpus
What would be a good method to provide a large collection of semantically annotated texts with formal, deep semantics rather than shallow? We argue that a bootstrapping approach comprising state-of-the-art NLP tools for parsing and semantic interpretation, in combination with a wiki-like interface for collaborative annotation of experts, and a game with a purpose for crowdsourcing, are the star...
YAWN: A Semantically Annotated Wikipedia XML Corpus
The paper presents YAWN, a system to convert the well-known and widely used Wikipedia collection into an XML corpus with semantically rich, self-explaining tags. We introduce algorithms to annotate pages and links with concepts from the WordNet thesaurus. This annotation process exploits categorical information in Wikipedia, which is a high-quality, manually assigned source of information, extr...
A Semantically Annotated Corpus from MEDLINE Abstracts
Automatic information extraction is a key technology to help researchers access the information contained in research papers and to extend databases on substances and biological processes. We aim to build information extraction databases [2] from biochemical papers and their abstracts available from the MEDLINE [3] database. To objectively measure the performance of our systems, we built a corp...
Journal
Journal title: Language Resources and Evaluation
Year: 2018
ISSN: 1574-020X,1574-0218
DOI: 10.1007/s10579-018-9421-3